Why Does Bagging Work? A Bayesian Account and its Implications

Author

  • Pedro M. Domingos
Abstract

The error rate of decision-tree and other classification learners can often be much reduced by bagging: learning multiple models from bootstrap samples of the database, and combining them by uniform voting. In this paper we empirically test two alternative explanations for this, both based on Bayesian learning theory: (1) bagging works because it is an approximation to the optimal procedure of Bayesian model averaging, with an appropriate implicit prior; (2) bagging works because it effectively shifts the prior to a more appropriate region of model space. All the experimental evidence contradicts the first hypothesis, and confirms the second.

Bagging

Bagging (Breiman 1996a) is a simple and effective way to reduce the error rate of many classification learning algorithms. For example, in the empirical study described below, it reduces the error of a decision-tree learner in 19 of 26 databases, by 4% on average. In the bagging procedure, given a training set of size s, a "bootstrap" replicate of it is constructed by taking s samples with replacement from the training set. Thus a new training set of the same size is produced, where each of the original examples may appear once, more than once, or not at all. On average, 63% of the original examples will appear in the bootstrap sample (each example is included with probability 1 - (1 - 1/s)^s, which approaches 1 - 1/e ≈ 0.632 as s grows). The learning algorithm is then applied to this training set. This procedure is repeated m times, and the resulting m models are aggregated by uniform voting. Bagging is one of several "multiple model" approaches that have recently received much attention (see, for example, Chan, Stolfo, & Wolpert 1996). Other procedures of this type include boosting (Freund & Schapire 1996) and stacking (Wolpert 1992). Two related explanations have been proposed for bagging's success, both in a classical statistical framework. Breiman (1996a) relates bagging to the notion of an order-correct learner. A learner is order-correct for an example x if, given many different training sets, it predicts the correct class for x more often than any other class. Breiman shows that, given sufficient replicates, bagging turns an order-correct learner into a nearly optimal one. Although this line of reasoning has intuitive value, its usefulness is limited, because it is seldom (or never) known a priori whether a learner is order-correct for a given example or not, or in which regions of the instance space it will be order-correct. Thus it is not possible to judge from an application domain's characteristics whether bagging …
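
To make the procedure above concrete, here is a minimal sketch of bagging in Python. This is an illustration under stated assumptions, not the implementation used in the paper: the scikit-learn DecisionTreeClassifier base learner, the default of m = 50 replicates, and the names bag and predict_vote are choices made here for illustration; X and y are assumed to be NumPy arrays.

import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def bag(X, y, m=50, seed=0):
    """Learn m models, each from a bootstrap replicate of (X, y)."""
    rng = np.random.default_rng(seed)
    s = len(X)
    models = []
    for _ in range(m):
        # Draw s indices with replacement: each original example appears
        # once, more than once, or not at all (~63% appear at least once).
        idx = rng.integers(0, s, size=s)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def predict_vote(models, X):
    """Aggregate the m models by uniform voting over their predictions."""
    votes = np.array([model.predict(X) for model in models])  # shape (m, n)
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])

Calling predict_vote(bag(X_train, y_train), X_test) reproduces the replicate-and-vote pattern described above; any other base learner can be substituted for the decision tree.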


Similar resources

Improving Text Classification Quality Using a Two-Level Classifier Committee

Nowadays, automated text classification has gained special importance due to the increasing availability of documents in digital form and the ensuing need to organize them. Although this problem belongs to the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown better performance than the others. I...

Lossless Online Bayesian Bagging

Bagging frequently improves the predictive performance of a model. An online version has recently been introduced, which attempts to gain the benefits of an online algorithm while approximating regular bagging. However, regular online bagging is an approximation to its batch counterpart and so is not lossless with respect to the bagging operation. By operating under the Bayesian paradigm, we in...
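
The online approximation this abstract alludes to can be sketched briefly. The standard trick (Oza and Russell's online bagging) is to present each arriving example to each base model k ~ Poisson(1) times, since an example's multiplicity in a batch bootstrap replicate, Binomial(s, 1/s), converges to Poisson(1) as the training-set size s grows; this is precisely why it is an approximation rather than lossless. The update method below is an assumed incremental-learner interface, not a specific library API, and this sketch is not the lossless Bayesian algorithm the paper itself develops.

import numpy as np

def online_bagging_step(models, x, y_label, rng):
    # Present the new example to each base model k ~ Poisson(1) times,
    # approximating its multiplicity in a batch bootstrap replicate.
    for model in models:
        for _ in range(rng.poisson(1.0)):
            model.update(x, y_label)  # hypothetical incremental-training call

# usage sketch: rng = np.random.default_rng(0)
# for x, y_label in stream: online_bagging_step(models, x, y_label, rng)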

Neighbourhood sampling in bagging for imbalanced data

Various approaches to extending bagging ensembles for class-imbalanced data are considered. First, we review known extensions and compare them in a comprehensive experimental study. The results show that integrating bagging with under-sampling is more powerful than over-sampling. They also single out Roughly Balanced Bagging as the most accurate extension. Then, we point out that complex...
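
As context for the under-sampling extensions this abstract compares, here is a minimal sketch of bagging with per-replicate under-sampling for a binary task: each replicate pairs a bootstrap of the minority class with an equal-sized random draw from the majority class, so every base model trains on balanced data. The base learner, m = 50, and the function name are assumptions made for illustration; Roughly Balanced Bagging differs in drawing the majority-class sample size from a negative binomial distribution rather than fixing it.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def underbagging(X, y, minority_label, m=50, seed=0):
    """Bagging with majority-class under-sampling (binary task assumed)."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)  # assumed larger class
    models = []
    for _ in range(m):
        idx = np.concatenate([
            # bootstrap the minority class ...
            rng.choice(minority, size=len(minority), replace=True),
            # ... and under-sample the majority class to the same size
            rng.choice(majority, size=len(minority), replace=False),
        ])
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

Prediction would reuse the same uniform vote as plain bagging.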

The Saying/Showing Distinction in Early Wittgenstein and Its Implications

Jafar Morvarid. In this paper, I shall try to clarify the saying/showing distinction and to emphasize the role of this distinction in constructing a coherent picture of language and the world. In order to properly understand the differences between the sayable and the showable, I will throw light on the limits of language and the world. I will explain why it is impossible to say the showab...

Pricing and hedging derivative securities with neural networks: Bayesian regularization, early stopping, and bagging

We study the effectiveness of cross-validation, Bayesian regularization, early stopping, and bagging in mitigating overfitting and improving generalization for pricing and hedging derivative securities, using daily S&P 500 index call options from January 1988 to December 1993. Our results indicate that Bayesian regularization can generate significantly smaller pricing and delta-hedging errors...


Publication date: 1997